The evolution and distribution of species body size* 



Aaron Clauset 1 and Douglas H. Erwin 1,2 

1 Santa Fe Institute, 1399 Hyde Park Road, Santa Fe NM, 87501, USA 
2 Department of Paleobiology, MRC-121, National Museum of Natural History, 
P.O. Box 37012, Washington DC, 20013-7012, USA 

The distribution of species body size within taxonomic groups exhibits a heavy right-tail extending 
over many orders of magnitude, where most species are significantly larger than the smallest species. 
We provide a simple model of cladogenetic diffusion over evolutionary time that omits explicit 
mechanisms for inter-specific competition and other microevolutionary processes yet fully explains 
the shape of this distribution. We estimate the model's parameters from fossil data and find that 
it robustly reproduces the distribution of 4002 mammal species from the late Quaternary. The 
observed fit suggests that the asymmetric distribution arises from a fundamental tradeoff between 
the short-term selective advantages (Cope's rule) and long-term selective risks of increased species 
body size, in the presence of a taxon-specific lower limit on body size. 



Most taxonomic groups show a common distribution of 
species body size, with a single prominent mode 
relatively near but not at the smallest species size and 
a smooth but heavy right-tail (often described as a 
right-skew on a log-size scale) extending for several orders 
of magnitude (e.g., Fig. 1). This distribution is naturally 
related to a wide variety of other species characteristics 
with which body size correlates, including habitat, 
life history, life span, metabolism and extinction 
risk. A greater understanding of the underlying constraints 
on, and long-term trends in, body size evolution 
may provide information for conservation efforts 
and insight about interactions between ecological and 
macroevolutionary processes. 

Studies of body size distributions have suggested that 
the prominent mode may be indicative of a taxon-specific 
energetically optimal body size [13], which 
is supported by microevolutionary studies of insular 
species [12]. However, evidence for Cope's rule, 
the observation that species tend to be larger than their 
ancestors, and the fact that most species are not close 
to their group's predicted optimal size (among other 
reasons [3]), suggest that this theory may be flawed. 
Alternatively, species body sizes may diffuse over 
evolutionary time. If so, Cope's rule alone could cause size 
distributions to exhibit heavy right-tails, although 
size-dependent speciation or extinction rates [2] or 
size-neutral diffusion near a taxon-specific lower limit on body 
size could also produce a similar shape. Furthermore, 
different mechanisms may drive body size evolution on 
different spatial and temporal scales, and the importance of 
inter-specific competition to the macroevolutionary 
dynamics of species body size is not known. 

We developed a generalized diffusion model of species 
body size evolution, in which the size distribution is the 
product of three macroevolutionary processes (Fig. 1). 
We combine these processes, each of which has been 
independently studied, in a single quantitative 
framework, estimate its parameters from fossil data 
on extinct terrestrial mammals from before the late 
Quaternary [19, 21], and test whether this model, or simpler 
variants, can reproduce the sizes of the 4002 known 
extant and extinct terrestrial mammal species from the late 
Quaternary (Recent species) [18]. 

This model assumes that (1) species size varies over 
evolutionary time as a cladogenetic multiplicative diffusion 
process: the size of a descendant species $x_d$ is 
the product of a stochastic growth factor $\lambda$ and its 
ancestor's size $x_a$, i.e., $x_d = \lambda x_a$. For each speciation 
event, a new $\lambda$ is drawn from the distribution $F(\lambda)$, which 
models the total influence on species size changes from all 
directions. A bias toward larger sizes (Cope's rule) appears as 
a positive average log-change to size, $\langle \log \lambda \rangle > 0$, and may 
depend on the ancestor's size. (2) Species body size is 
restricted by a taxon-specific lower limit $x_{\min}$ [6, 22], which 
we model by requiring that $F(\lambda < x_{\min}/x_a) = 0$, i.e., the 
largest possible decrease in size for a particular speciation 
event is $\lambda = x_{\min}/x_a$. In our computer simulations, 
time proceeds in discrete steps. At each step, exactly 
one new species is produced, which is the descendant of 
a randomly selected species. Finally, (3) every species 
independently becomes extinct with probability $p_e(x)$, 
which increases monotonically with size. A schematic 
of the model is shown in Fig. 2A (for technical details 
see Appendix A 1). 

FIG. 1: Smoothed species body size distribution of 4002 
Recent terrestrial mammals (data from [18]), showing the three 
macroevolutionary processes that shape the relative 
abundances of different sizes. The left-tail of the distribution is 
created by diffusion in the vicinity of a taxon-specific lower 
limit near 2 g, while the long right-tail is produced by the 
interaction of diffusion over evolutionary time (including trends 
like Cope's rule) and the long-term risk of extinction from 
increased body size. 

FIG. 2: (A) A schematic illustrating a simple cladogenetic 
diffusion model (see text) of species body size evolution, where 
the size of a descendant species $x_d$ is related to its ancestor's 
size $x_a$ by a multiplicative factor $\lambda$. (B) Empirical data on 
1106 changes in North American mammalian body size (data 
from [19]), as a function of ancestor size, overlaid with the 
estimated model of within-lineage changes, where the average 
log-change $\langle \log \lambda \rangle$ varies piecewise as a function of body size 
(see Appendix B 2). 

To make this model appropriately realistic, we estimated 
the form of each process from fossil data. The 
lower limit on mammalian body size is near 2 g, close to 
the size of both the Etruscan shrew (Suncus etruscus) 
and the bumblebee bat (Craseonycteris thonglongyai). 
Fossil evidence suggests that this limit has existed since 
at least the Cretaceous-Tertiary boundary. 
Further, a limit in this vicinity is supported by both 
experimental [22] and theoretical work [6] on mammalian 
metabolism. 

Away from this limit, mammalian body size evolution 
is governed mainly by diffusion with a bias (Cope's 
rule) [3], while its evolution near the lower limit is 
likely constrained by the need for relatively specialized 
morphological structures [1]. We expect this latter effect 
to appear in fossil data as a systematic intensification of 
Cope's rule for very small-bodied species, i.e., increased 
$\langle \log \lambda \rangle$ as $x_a \to x_{\min}$. From ancestor-descendant size 
data for 1106 extinct North American terrestrial mammals, 
we estimated and compared three models of the 
distribution $F(\lambda)$ as a function of ancestor size, including 
the model suggested by Alroy [3], which predicts a 
moderately bi-modal distribution in body sizes. Of these, 
a piecewise model (Fig. 2B), with no effective optimal 
body size, has the best empirical support (model selection 
via likelihood ratio test and Bayesian information 
criterion; see Appendix B 2). This model includes both 
a strengthening of Cope's rule for small-bodied species 
($x < 32$ g) and a small but uniformly positive bias for 
larger species, resulting in an average body-size growth 
of 4.1 ± 1.0% between ancestors and their descendants 
($\langle \log \lambda \rangle = 0.04 \pm 0.01$). 
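
As a quick check on the two figures quoted above, the mean log-change can be converted to a percentage change. This assumes that "log" denotes the natural logarithm, which is an assumption on our part (the base is not stated in this passage), although it is the base under which the two quoted numbers agree.

```latex
\[
  e^{\langle \log \lambda \rangle} - 1 = e^{0.04} - 1 \approx 0.041,
  \qquad
  e^{0.03} - 1 \approx 0.030,
  \qquad
  e^{0.05} - 1 \approx 0.051 .
\]
```

Under that assumption, $\langle \log \lambda \rangle = 0.04 \pm 0.01$ corresponds to roughly 4.1 ± 1.0% growth per speciation event (strictly, growth of the geometric mean of $\lambda$).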

This result supports the existence of short-term selective 
advantages for increased species body size, e.g., better 
tolerance of resource fluctuations, better thermoregulation, 
and better predator avoidance, but also implies 
a more nuanced view: small-bodied species exhibit even 
greater selective advantages from increased size, e.g., 
because of greater morphological flexibility. 

Empirical estimates of extinction rates (or equivalently, 
speciation rates) as functions of body size are 
uncertain due to the bias and incompleteness of 
the fossil record. We partly control for this uncertainty 
by using a simplistic model of extinction risk $p_e(x)$, 
largely estimated from the data, where extinction occurs 
independently with a probability calculated only from 
the species' size. We specified a basal extinction rate $\beta$ 
by assuming that the number of Recent terrestrial mammal 
species is close to a putative carrying capacity. We 
then let extinction risk per unit time increase logarithmically 
with body size (see Appendix A 2). This model 
leaves only the rate $\rho$ by which risk increases with size 
as a free parameter, which was chosen by minimizing the 
statistical distance between the simulated and empirical 
distributions (see Appendix A 3). 

Inserting these three processes, as estimated above, 
into our computer model, we found that the model 
accurately predicted the distribution of Recent terrestrial 
mammal sizes over its seven orders of magnitude 
(Fig. 3A), and was particularly accurate for small-bodied 
species ($x < 80$ g). Our sensitivity analysis further 
indicated that this prediction was highly robust to variations 
in most of the estimated parameters, but highly sensitive 
to the location of the lower limit on body size. The 
estimated value of $x_{\min} \approx 2$ g, however, is the most strongly 
supported of all model parameters. Thus, even large 
revisions to the other parameter estimates are unlikely 
to change our general conclusions (see Appendix C 2). 
Also, although a range of $\rho$ values produced size 
distributions that were statistically close to the empirical 
distribution, the model predicts a particular extinction risk 
curve (Fig. S4) that could be tested with appropriate 
empirical data. 

FIG. 3: Simulated distributions of species body size (central tendency ± 95% confidence intervals from 1000 repetitions; all 
model parameters estimated as described in the text) and the empirical distribution of Recent terrestrial mammals. (A) The 
model described in the text. (B) The same model as A but with a bias $\langle \log \lambda \rangle$ that is independent of size. (C) The same 
model as B but with an extinction risk that is independent of size. (For details and additional results, see the appendices.) 

To further discriminate among alternative explanations 
for the species size distribution, we tested simpler 
diffusion models, each with parameters estimated 
from fossil data (see Appendix D), including (1) unbiased 
diffusion with a lower boundary, (2) Cope's rule 
with size-dependent extinction, (3) Cope's rule alone, 
(4) size-dependent extinction alone, and (5) a version of 
the full model that omits the increased bias for 
small-bodied species ($x < 32$ g). We found that these models 
all predicted size distributions that differed, sometimes 
dramatically so, from the empirical distribution 
(Figs. 3B and 3C; see also Appendix D). Additionally, we found that 
a positive bias $\langle \log \lambda \rangle > 0$ for large-bodied species is not 
necessary if the extinction risk increases less quickly (see 
Appendix C 2). These results support the inclusion of a 
fundamental lower limit, the diffusion of species size, and 
an increasing risk of extinction with size, as well as an 
increased bias toward larger sizes for small-bodied species 
($x < 32$ g). 

Thus, the shape of a body size distribution can be 
interpreted in the context of these three macroevolutionary 
processes. An intermediate location for the distribution's 
mode (40 g for terrestrial mammals) is mainly caused by 
diffusion in the vicinity of the physiological lower limit on 
body size, which prevents the smallest species from being 
the most abundant. A heavy right-tail is then caused 
primarily by diffusion in the presence of extinction risks 
that increase weakly with size ($\rho > 0$). For mammals, the 
within-lineage tendency toward increased size (Cope's 
rule, $\langle \log \lambda \rangle > 0$) shifts the mode toward slightly larger 
sizes and slightly increases the heaviness of the right-tail. 

Under different conditions, these processes produce 
markedly different body size distributions. For instance, 
a long left-tail extending toward small-bodied species 
would indicate that the risk of extinction decreases with 
larger size ($\rho < 0$). Similarly, a more symmetric distribution 
would indicate both that extinction rates are relatively 
size-independent ($\rho \approx 0$) and that changes to body 
size convey few selective advantages ($\langle \log \lambda \rangle \approx 0$). 
Although a suitable body size distribution is not currently 
available for dinosaurs (but see [27]), evidence suggests 
that it may be more symmetric than for mammals. The 
right-skewed distribution's ubiquity, such as for insects 
and birds [1], suggests that such circumstances are 
rare, and that the mammalian distribution represents the 
norm. 

This model omits explicit mechanisms for many canonical 
ecological and microevolutionary processes, including 
the impact of inter-specific competition, geography, 
predation, population dynamics, and size variation between 
speciation events (anagenetic evolution). This omission 
suggests that their contributions to the systematic or large-scale 
character of species body size distributions can be 
compactly summarized by the values of certain model 
parameters, e.g., the strength of Cope's rule $\langle \log \lambda \rangle$ or the 
manner in which extinction risk increases with body size $\rho$. 
Some aspects of the body size distribution, however, are 
not explained by this model, such as the slight 
over-abundance of terrestrial mammal species around 300 kg 
and the slight under-abundance around 1 kg (Fig. 3A). 
Whether such deviations can be attributed to 
phylogenetically correlated speciation and extinction events is an 
open question. A more thorough examination of these 
macroevolutionary processes may explain their particular 
form and origin, and answer why extinction rates increase 
(or speciation rates decrease) weakly with body size, why 
physiological lower limits on body size exist and are conserved 
within a taxonomic group, and why some groups exhibit 
macroevolutionary trends but others do not. 

APPENDICES 

These appendices document the technical details of our 
study. 

• Appendix A fully describes the cladogenetic model 
used to test our main hypotheses, including the 
model's specifications (Appendix A 1), the statistical 
estimation of the model parameters from the 
mammalian fossil data (Appendix A 2), and our 
score function for comparing the results of the 
model to empirical data (Appendix A 3). 

• Appendix B describes our model of species size 
variation at speciation events, including a new analysis 
of the empirical evidence for Cope's rule 
(Appendix B 1) and the estimation of the distribution 
$F(\lambda)$ of within-lineage changes to body size 
(Appendix B 2). 

• Appendix C presents supplementary results from 
simulating the model, including snapshots from a 
single simulation (Appendix C 1), and the results 
of our analysis of the model's sensitivity to the 
estimated parameters (Appendix C 2). 

• Appendix D presents detailed comparisons of the 
model with simpler alternative diffusion models, 
several of which have previously been suggested as 
explanations of right-skewed size distributions. 

• Appendix E gives a complete Matlab-code 
implementation of the model. 



APPENDIX A: A CLADOGENETIC DIFFUSION 
MODEL OF BODY SIZE EVOLUTION 

Complex theoretical questions about the evolution of 
body size, such as the ones we consider, are typically ex- 
plored with simulations. Such a choice is mainly driven 
by the fact that a mathematical analysis of branching 
processes is often intractable for all but the most sim- 
ple questions. On the other hand, poorly executed sim- 
ulation studies can be misleading as a result of incor- 
rect specification, among other reasons. We make a con- 
certed effort to avoid such problems by defining a model 
whose parameters can be estimated directly from fossil 
data prior to the late Quaternary, and whose output can 
be validated against data from the late Quaternary (Re- 
cent species). Although these two data sources are not 
logically independent, they are perhaps as close to inde- 
pendent as we might wish for such a macroevolutionary 
study. We note that while we mainly study the body size 
distribution of terrestrial mammals here, this framework 
can easily be adapted to other taxonomic groups, e.g., 
birds. 



1. Model specification 

As described in the main text, our model combines 
three simple mechanisms related to body size evolution. 
Each of these processes has been previously suggested or 
studied in the literature, but they are combined here in a 
coherent, quantitative framework that engages directly with 
empirical data. We now briefly describe the technical 
details of the three processes. 

1. The range of possible body sizes for a particular 
higher taxon, e.g., terrestrial mammals, obeys a 
lower limit $x_{\min}$. A limit like this was suggested 
in [1] on the basis that physiological factors, e.g., 
metabolic requirements, constrain how small a particular 
body plan can become without fundamental 
innovation. (For convenience, we also assume that 
body size obeys an upper limit, but set this limit 
at an extremely large size, $x_{\max} = 10^{15}$ g.) 

2. As is conventional, simulated time proceeds in 
discrete steps, each of which corresponds to a single 
event of cladogenesis. Although realistically each 
cladogenetic event could produce a variable number 
of descendent species, we present results only 
for the case where exactly two new species are 
created while the ancestor species becomes extinct. 
We note, however, that several apparently reasonable 
variations on this rule, e.g., creating one or more 
descendent species while letting the ancestral species 
continue, appear to produce equivalent results. 

At each of these speciation events, each descendent 
species' body size $x_d$ varies from its ancestor's 
body size $x_a$ according to a multiplicative random 
walk. That is, the size of a descendent is the 
product of its ancestor's body size and a random 
variable $\lambda$, which represents the relative percentage 
change in body size due to all contributing factors. 
We then assume that the instantaneous distribution 
of changes to body size $F(\lambda)$ for a given event 
has two main characteristics: (1) it is stable over 
evolutionary time (i.e., it is not a function of time $t$, 
although it may be a function of ancestor size $x_a$), 
and (2) it always respects the aforementioned limits 
on body size. This latter requirement implies that 
for a given ancestor body size $x_a$, the distribution 
of allowed changes to size $F(\lambda)$ is bounded on the 
interval $[x_{\min}/x_a,\; x_{\max}/x_a]$. Fig. S1B illustrates this idea, 
showing how the support of the distribution varies 
as a function of body size. If $\langle \log \lambda \rangle \neq 0$, then we 
say that $F(\lambda)$ is "biased," with a positive bias 
corresponding to Cope's rule; if $\langle \log \lambda \rangle = 0$, we say 
that $F(\lambda)$ is "unbiased." 

In the physics literature (see [28]), this boundary 
effect is similar to an "absorbing boundary" 
condition in a diffusion-reaction equation, i.e., we 
require that the probability density go to zero at the 
boundary, $s(x) = 0$ at $x = x_{\min}$. In contrast, 
a "reflecting" or "insulating" boundary would 
require that the flux across the boundary be zero, 
$ds/dx = 0$ at $x = x_{\min}$. Unfortunately, these same 
terms have different meanings in the body size 
literature (see [13]); thus, we avoid their use entirely. 

FIG. S1: Results of modeling the evolution of body sizes for 4002 Recent terrestrial mammal species. (A) A schematic 
illustrating a simple cladogenetic model (see text) of species body size evolution, on the basis of a multiplicative diffusion 
process where the size of a descendant species $x_d$ is related to its ancestor's size $x_a$ by a multiplicative factor $\lambda$. (B) Model 
of the distribution of within-lineage body size changes $F(\lambda)$, where lower and upper boundaries on body size are enforced 
by setting $F(\lambda < x_{\min}/x_a) = 0$. Thus, as a lineage approaches $x_{\min}$, the distribution increasingly favors changes in the 
opposite direction of the limit (inset: average change in log-body size, as a function of ancestral body size, with $\langle \log \lambda \rangle = 0$, 
$x_{\min} = 1.8$ g and $x_{\max} = 10^7$ g). We incorporate a model of Cope's rule by letting the mean of this distribution $\mu(x_a)$ vary 
as a function of $x_a$, where $\mu(x_a)$ is estimated from fossil data (see Appendix B). (C) Histogram of Recent mammal body 
sizes overlaid by an example distribution produced by the model (inset: corresponding complementary cumulative distribution 
functions). (D) The central tendency (with 95% confidence intervals) of the simulated distribution of species body sizes and 
the smoothed empirical distribution for 4002 Recent mammal species (Gaussian kernel). 

3. Species become extinct independently with a 
probability $p_e$ that depends only on species 
body size. We considered two functional forms 
for how this risk of extinction varies with 
body size: a power-law function of the form 
$\log_{10} p_e(x) = \rho \log_{10} x + \log_{10} \beta$, where $\beta$ is the 
baseline extinction rate and $\rho$ is the rate at which 
the risk increases with log-body size, and a 
logarithmic function $p_e(x) = \rho \log_{10} x + \beta$. 

The notion that extinction risk increases with body 
size ($\rho > 0$) is a conventional one in the body 
size literature [26], although most empirical 
documentation of this notion concerns relatively 
modern species. As such, relatively little is known 
about speciation and extinction rates in the 
fossil record [25, 30]. However, as population size 
generally decreases with increased body size, the 
increased extinction risk could result from populations 
of larger-sized organisms being closer to 
inviable population sizes. The result for this 
mechanism is that one parameter, the rate $\rho$ at which 
extinction risk increases with body size, remains 
free in our study. 

We note that an equivalent model would allow the 
speciation rate, or both extinction and speciation, 
to vary with body size. The absolute value of 
the speciation and extinction rates is not important, 
but rather their ratio is. For a discrete-time 
model, size-dependent extinction rates are 
significantly easier to work with. 
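
The two functional forms considered under Rule 3 can be written out directly. The sketch below is only illustrative: the function names are ours, the power-law values of beta and rho are taken from Table S1, and the value of rho for the logarithmic form is a hypothetical choice (it is set so that the two curves coincide at 1 g and at 10^7 g), not a fitted value from the study.

```python
import math


def p_ext_power_law(x, beta, rho):
    """Power-law form: log10 p_e(x) = rho*log10(x) + log10(beta)."""
    return beta * x ** rho


def p_ext_logarithmic(x, beta, rho):
    """Logarithmic form: p_e(x) = rho*log10(x) + beta."""
    return rho * math.log10(x) + beta


beta = 1.0 / 5000      # baseline extinction rate, beta = 1/n (Table S1)
rho_power = 0.025      # rate of extinction increase, power-law form (Table S1)

# HYPOTHETICAL rho for the logarithmic form: chosen here so that both curves
# agree at x = 1 g and x = 1e7 g. It is not a value reported in the paper.
rho_log = (p_ext_power_law(1e7, beta, rho_power) - beta) / 7.0

sizes = [10.0 ** k for k in range(8)]          # 1 g ... 1e7 g
power = [p_ext_power_law(x, beta, rho_power) for x in sizes]
logar = [p_ext_logarithmic(x, beta, rho_log) for x in sizes]

if __name__ == "__main__":
    for x, a, b in zip(sizes, power, logar):
        print(f"{x:>10.0f} g  power-law={a:.3e}  logarithmic={b:.3e}")
```

Printing the two columns side by side gives a feel for how similar the curves can be over seven orders of magnitude when rho is small.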

FIG. S2: Analysis of 1106 pairs of mammal species in the North American fossil record. (A) Descendent body size $x_d$ versus 
ancestor body size $x_a$ overlaid by the relation $x_d = x_a$, representing the null hypothesis of no bias toward larger or smaller 
body sizes, i.e., $\langle \log \lambda \rangle = 0$. The best-fit allometric relation $\log x_d = A \log x_a$ for this body size data (by standardized major 
axis regression [29]) produces an estimated slope $A = 1.02 \pm 0.1$ (where ± indicates the 95% confidence interval; $r^2 = 0.95$). (B) 
Estimated density (Gaussian kernel) of the distribution $F(\lambda)$ of within-lineage changes to species body size (solid line; equivalent 
to distribution of vertical residuals in A), along with the maximum likelihood log-normal distribution (dashed). (C) Change 
in species body size $\lambda$ as a function of ancestor size (circles) overlaid with the best model of the form $\log \lambda(x_a) \sim \mathcal{N}[\mu(x_a), \sigma^2]$ 
(dashed lines). Under this model, changes in body size at speciation events are systematically biased toward larger sizes 
(Cope's rule); the bias is strongest for small-bodied species, but still positive [$\mu(x_a) = 0.04$] for larger species, $x > 32$ g. A 
likelihood ratio test indicates that this model is a better fit to the data than a model with no bias [$\mu(x_a) = 0$] for larger species 
($p = 1.44 \times 10^{-4}$; see Appendix B 2). We note that this simple model is a more conservative one than a model that includes 
the heavy tails of the distribution shown in B. 



Only a few more words are necessary to complete our 
specification of the model. At each time step, one species, 
chosen uniformly at random from the extant set, 
undergoes cladogenesis according to Rule 2. This action 
produces two daughter species, one of which is new and the 
other of which replaces the ancestral species in the extant 
set. Subsequently, each extant species becomes extinct 
according to Rule 3; extinct species are removed from 
the extant set. Fig. S1A illustrates this branching process 
schematically. The model is initialized with a single 
founder species with body size $x_0$, and proceeds for $t_{\max}$ 
time steps (the number of steps is also the cumulative 
number of species produced). Fig. S1B illustrates the 
form of Rule 2 that we use (see Appendix B for more 
details), where the largest decrease in body size is 
constrained so that it produces a daughter 
species with size $x_{\min}$. Fig. S1C shows an example 
of the resulting simulated distribution of species body 
sizes, where we have used the parameter values given in 
Table S1, and Fig. S1D shows the central tendency of 
this model. 
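
The procedure just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Matlab implementation of Appendix E: the piecewise form of the bias in `mean_log_change`, the use of rejection sampling to enforce the size limits, the treatment of the tabulated 0.63 as a standard deviation in natural-log units, and all function names are our assumptions. The numerical constants are the values listed in Table S1.

```python
import math
import random

# Values from Table S1. How they enter the piecewise bias is an ASSUMPTION
# made for illustration; the fitted form is described in Appendix B 2.
X_MIN, X_MAX = 1.8, 1e15        # body-size limits in grams (Rule 1)
X_0 = 40.0                      # founder body size in grams
RHO = 0.025                     # rate of extinction increase (power-law form)
C1, C2, DELTA, SIGMA = 0.33, 1.30, 0.04, 0.63


def mean_log_change(x_a):
    """Hypothetical piecewise bias: extra bias below 10**C2 g, DELTA above."""
    log_x = math.log10(x_a)
    if log_x < C2:
        return C1 * (1.0 - log_x / C2) + DELTA
    return DELTA


def draw_descendant(x_a, rng):
    """Rule 2: x_d = lambda * x_a with log(lambda) normal, truncated by
    rejection so that X_MIN <= x_d <= X_MAX (assumed truncation scheme)."""
    while True:
        x_d = x_a * math.exp(rng.gauss(mean_log_change(x_a), SIGMA))
        if X_MIN <= x_d <= X_MAX:
            return x_d


def extinction_probability(x, beta):
    """Rule 3, power-law form: p_e(x) = beta * x**RHO."""
    return beta * x ** RHO


def simulate(n_equilibrium, t_max, seed=0):
    """Run t_max cladogenesis steps and return the extant body sizes."""
    rng = random.Random(seed)
    beta = 1.0 / n_equilibrium          # baseline extinction rate
    extant = [X_0]
    for _ in range(t_max):
        if not extant:                  # the whole clade has died out
            break
        i = rng.randrange(len(extant))  # species chosen uniformly at random
        ancestor = extant[i]
        extant[i] = draw_descendant(ancestor, rng)       # replaces ancestor
        extant.append(draw_descendant(ancestor, rng))    # the new species
        extant = [x for x in extant
                  if rng.random() >= extinction_probability(x, beta)]
    return extant


if __name__ == "__main__":
    sizes = simulate(n_equilibrium=200, t_max=2000, seed=1)
    print(len(sizes), "extant species; smallest", min(sizes), "g")
```

The small `n_equilibrium` and `t_max` in the usage lines keep the pure-Python loop fast; the values in Table S1 would require a vectorized implementation to run in reasonable time.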



2. Parameter estimation 

To implement this model on a computer, we must 
choose the form of each mechanism, e.g., $F(\lambda)$. Where 
possible, we estimated both the form and the corresponding 
parameters directly from fossil data; the only genuine 
free parameter in the model is $\rho$, the rate at which 
extinction risk increases with size. In this section, we describe 
our methodology for estimating parameters for Rules 1 
and 3, the size of the founder species, and the number 
of species to simulate. The methodology for parameterizing 
Rule 2 is slightly more involved and is described 
subsequently (Appendix B). 

Rule 1 (boundaries) requires parameters to define a 
lower limit on body size. The most direct way to estimate 
these values is to consider fossil [19, 21] and 
Recent [18] body size data. Each of these sources agrees 
that the minimum mammalian body size is in the 
neighborhood of $x_{\min} \approx 2$ g [e.g., both the Etruscan shrew (S. 
etruscus) and the bumblebee bat (C. thonglongyai) are 
in this range]. Experimental [22] and theoretical work [6] 
on metabolism also supports a fundamental limit in this 
vicinity. The particular size of the founder species has little 
impact on the simulation results (see Appendix C 2), 
and for convenience we choose it to be equal to the mode 
of the Recent distribution, $x_0 = 40$ g. 

Parameter estimates for Rule 3 (extinction rates) can 
be partially derived from existing fossil data. We 
estimate the baseline extinction rate $\beta$ for terrestrial 
mammals in the following way. If the number of Recent 
terrestrial mammals represents a roughly stable equilibrium, 
then for each cladogenesis event in the simulation there 
must be one extinction event, on average. (This 
equilibrium assumption is not central to our results, and can be 
relaxed without impacting the fundamental nature of the 
model, so long as the total number of extant species grows 
slowly relative to the rate of species turnover.) Thus, the 
baseline extinction rate is simply $\beta = 1/n$, where $n$ is 
the expected number of species at equilibrium. We let 
$n = 5000$, although its precise value is unimportant. By 
letting extinction rate increase with body size, the actual 
number of species at equilibrium $n_{\mathrm{eq}}$ will be somewhat 
less than this number. If the true number of terrestrial 
mammal species is substantially greater than our current 
estimate of roughly 5000, or if the assumption of 
equilibrium is incorrect, then the extinction probability curve 
can be rescaled by lowering the baseline extinction rate, 
which does not affect other aspects of the simulation such 
as the overall shape of the distribution. 

FIG. S3: The same data as in Fig. S2C along with a smoothed 
version (exponential kernel) showing the mean ± one standard 
deviation. The smoothed trend is quite similar to the 
piecewise linear model that we fitted to the data via maximum 
likelihood (see Appendix B 2). 

parameter                     symbol       value        source Ref. 
lower bound                   $x_{\min}$   1.8 g        [18, 19, 21] 
founder body size             $x_0$        40 g 
species at equilibrium        $n$          5000 
baseline extinction rate      $\beta$      $1/n$        — 
rate of extinction increase   $\rho$       0.025        — 
mean species lifetime         $\nu$        1.60(1) My   [19, 21] 
years in equilibrium          $\tau$       60 My        [19] 
log λ-intercept               $c_1$        0.33         [19] 
log x-intercept               $c_2$        1.30         [19] 
systematic bias               $\delta$     0.04         [19] 
variance                      $\sigma$     0.63         [19] 
power-law tail                $\alpha$     3.3(1)       [19] 

TABLE S1: Cladogenetic simulation parameters, their 
estimated values and the data sources from which the estimates 
were derived. The parameters can be grouped according to 
mechanism: the physiological lower limit of the terrestrial 
mammalian body size ($x_{\min}$); the distribution $F(\lambda)$ of 
within-lineage changes to body size ($c_1$, $c_2$, $\delta$, $\sigma$ and $\alpha$), where $\delta$ 
denotes the systematic bias away from smaller body sizes 
(Cope's rule) and $c_1$ and $c_2$ denote the additional bias for 
small-bodied species; the initial conditions and duration of 
the simulation ($x_0$, $\tau$, $\nu$ and $n$). 

We estimate the length of the simulation $t_{\max}$ by 
estimating the total number of mammalian species since the 
Cretaceous-Tertiary boundary. We estimate this 
number as $t_{\max} = n\tau/\nu$, where $\tau$ is the number of years of 
equilibrium, $\nu$ is the average duration or lifetime of a 
species, and $n$ is the number of species at equilibrium. 
We let $\tau \approx 60$ My, although its precise value has little 
impact on the results of the simulation. Estimates 
of the average duration of a species, however, vary quite 
widely depending on the data used. In the Alroy data 
set, $\nu = 2.32(8)$ My (n = 1703; the parenthetical value 
denotes the standard error in the last digit), while in the 
NOW data set, $\nu = 1.52(1)$ My (n = 14099). We 
estimate $\nu$ to be the average of these: $\nu = 1.60(1)$ My, although 
its exact value is not important (see Appendix C 2). 
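
The arithmetic behind these estimates is short enough to spell out. The sketch below uses only the numbers quoted in the text and in Table S1; the variable names are ours, and the observation that a sample-size-weighted mean of the two lifetimes lands near the quoted 1.60 My is our reading of "the average of these", not something the text states.

```python
# Values quoted in the text and in Table S1.
n = 5000            # species at equilibrium
tau_my = 60.0       # years in equilibrium, in My
nu_my = 1.60        # mean species lifetime used in the simulation, in My

# Baseline extinction rate and simulation length.
beta = 1.0 / n                       # one extinction per cladogenesis on average
t_max = round(n * tau_my / nu_my)    # number of simulated cladogenesis steps

# ASSUMPTION: pooling the two data sets by sample size.
lifetimes = {"Alroy": (2.32, 1703), "NOW": (1.52, 14099)}   # (mean My, count)
total = sum(count for _, count in lifetimes.values())
nu_weighted = sum(mean * count for mean, count in lifetimes.values()) / total

if __name__ == "__main__":
    print("beta =", beta)
    print("t_max =", t_max)
    print("weighted mean lifetime = %.3f My" % nu_weighted)
```

With these inputs the simulation length comes to 187,500 steps, and the pooled lifetime is within about 0.01 My of the value used.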

Finally, we estimate the value of $\rho$ by numerically 
minimizing the distributional distance (see Appendix A 3) 
between the model and the empirical data for terrestrial 
mammals (Fig. S4A). In general, we report results for 
the power-law model of extinction risk; the fitted value 
of $\rho$ in the logarithmic model is such that the two risk 
curves are almost identical (see Fig. S4B), indicating that 
the functional form is not important: both models result 
in a close-to-linear increase in extinction risk with 
log-size such that the risk of extinction at each step for 
the largest species is 56 to 58% larger than the basal 
extinction risk (32 to 34% for $F(\lambda)$ with log-normal tails). 
When spread over six or seven orders of magnitude, this 
causes a slight, positive dependence of extinction risk on 
body size. We note that the form of this curve provides 
a testable prediction of the model. 



3. Scoring the quality of the model 

The output of the simulation is a set of species body 
sizes. To evaluate the quality of this set relative to the 
empirical data on terrestrial mammals, we use a distance 
measure for statistical distributions, the tail- weighted 
Kolmogorov-Smirnov (wKS) goodness-of-fit statistic [3l[ 



$$\mathrm{wKS} = \max_x \frac{|S(x) - P(x)|}{\sqrt{P(x)\,[1 - P(x)]}} \qquad \text{(A1)}$$



where S(x) is the cumulative distribution function (CDF) 
of the simulated data and P(x) is the CDF of the empir- 
ical data. This statistic is independent of any particular 
binning scheme and thus gives a relatively general char- 
acterization of the dissimilarity of two distributions by 
measuring the maximum absolute deviation between the 
simulated and empirical cumulative distributions. Very 
small values (wKS < 0.3) indicate close agreement for
all values of $x$. In Fig. S1C, for instance, wKS $\approx 0.17$.
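The statistic in Eq. (A1) can be sketched in a few lines of Python (not the paper's Matlab). Two choices here are assumptions rather than details given in the text: the maximum is taken over the empirical data points, and points where $P(x)$ equals 0 or 1 are skipped because the weight in the denominator vanishes there. The helper names are illustrative.

```python
from bisect import bisect_right
from math import sqrt


def ecdf(sorted_sample, x):
    """Empirical CDF: fraction of the sample that is <= x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def wks(simulated, empirical):
    """Tail-weighted KS distance between two samples (sketch of Eq. A1)."""
    sim = sorted(simulated)
    emp = sorted(empirical)
    worst = 0.0
    for x in emp:
        p = ecdf(emp, x)
        if p <= 0.0 or p >= 1.0:
            continue  # denominator sqrt(P(1-P)) is zero here
        s = ecdf(sim, x)
        worst = max(worst, abs(s - p) / sqrt(p * (1.0 - p)))
    return worst


print(wks([3, 4, 5, 6], [1, 2, 3, 4]))
```

Dividing by $\sqrt{P(1-P)}$ inflates deviations where $P$ is close to 0 or 1, which is what gives the tails the same influence as the region near the median.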

Some readers may be familiar with the more commonly used
Kolmogorov-Smirnov (KS) goodness-of-fit statistic. The tail-weighted
version differs by giving equal weight to all parts of the
distribution, including the tails. In contrast, the traditional KS
statistic effectively weights the area near the median of the
distribution the most, and thus can underestimate strong differences
in the tails. This makes the tail-weighted version more difficult to
minimize: all parts of the simulated distribution must be close to the
empirical one, not just the middles of the distributions. We have
tried using both statistics to score the quality of the model results,
and we find that numerically minimizing the tail-weighted version
chooses values of $\rho$ that produce significantly more convincing
results for larger-bodied species, e.g., $x > 10^4$ g.

Finally, because the model produces a dynamic equi- 
librium in the species body size distribution, to evaluate 
its typical behavior, and to prevent transient effects from 
skewing our quality scores, we average the wKS statistic 
over regularly spaced intervals in the last 15 My of sim- 
ulated time. When we evaluate the quality of a set of 
parameter values, we further average this value over sev- 
eral hundred independent trials. 



APPENDIX B: CHANGES TO BODY SIZE AND 
COPE'S RULE 

Rule 2 represents the manner in which body sizes vary 
at speciation events. Phylogenetic body size data for a 
wide range of terrestrial mammals would be the preferred 
way to determine the best model of within-lineage body 
size variation, but such ancestor-descendent data is not 
currently available for a sufficiently large and diverse set 
of terrestrial mammals. Instead, we use Alroy's putative 
ancestor-descendant data, reconstructed from fossil data 
for North American mammals, as a proxy. This data has 
been used in several previous studies of within-lineage 
variation of body size [lj, 0j|, and details of the non-phylogenetic
reconstruction process for the 1106 pairs of terrestrial mammal
species are given there. From this data, we estimate a parametric
model for $F(\lambda)$.

The non-phylogenetic nature of this data, however, im- 
plies that there are likely to be several inversions of ances- 
tors and descendants, as well as several incorrect pairings 
of ancestors with descendants. Fortunately, the statisti- 
cal nature of our analysis implies that so long as the 
number of putative pairs is relatively large, such errors 
will not obscure the true average log-change, which is 
precisely the aspect of this data most important to our 
study. Further, our sensitivity analysis indicates that the 
precise details of the inferred model, e.g., the average and 
variance, do not matter much with regard to our overall 
conclusions (see Appendix C 2), so long as a log-normal
model of change is a relatively good model of the data. 



1. Empirical evidence for Cope's rule 

Empirical evidence for and against Cope's rule has been studied in a
variety of different taxonomic groups [Til . [H, US, [H, [H, [H, [H.
For terrestrial mammals, the evidence is relatively strong, with
Alroy's study [14] showing a slight systematic positive bias
$\langle \log \lambda \rangle > 0$, with descendants tending to be
slightly larger than their ancestors.

FIG. S4: (A) Estimation results for fitting the free parameter $\rho$
in the power-law model of extinction risk, in two alternative cases,
one where the distribution $F(\lambda)$ of within-lineage changes to
body size has log-normal tails (blue), and one where the tails decay
as a power law (red). Similar results are obtained under the
logarithmic model of extinction risk. All other parameters take the
values given in Table S1. For clarity, we also plot a smoothed trend
(exponential kernel) over the sampled data. Each point is the average
goodness-of-fit $\langle \mathrm{wKS} \rangle$, for the last 15 My of
the simulation, over 50 independent trials. (B) The fitted
extinction-risk curves for models of $F(\lambda)$ with power-law and
non-power-law tails, and for models where the extinction risk
increases as a logarithm or power of size (see Appendix A 1, Rule 3).
The similarity of the curves between these two extinction models shows
that a generally log-linear form is sufficient.

In order to specify Rule 2, however, we need to know not only whether
there is a positive bias, but also how strong the bias is as a
function of ancestor size. This can be done by directly estimating the
shape of $F(\lambda)$ as a function of ancestor size. Thus, we conduct
a new analysis of the previously studied ancestor-descendant data.

FIG. S5: Snapshots of the simulated species body size distribution,
relative to the empirical distribution, from a single simulation
trial, taken at n = {505, 2005, 30005, 100005} total species (A, B, C
and D, respectively). For clarity, the insets show the corresponding
complementary cumulative distribution functions.

FIG. S6: The time series of wKS statistics for the simulation in
Fig. S5. The bold circles indicate the positions and scores of the
four snapshots.

Fig. S2A shows descendant body size $x_D$ as a function of ancestor
body size $x_A$, for Alroy's fossil data on North American mammals,
and illustrates that descendants tend to be roughly the same size as
their ancestors. The best-fit allometric relation [29]
$\log x_D = \lambda \log x_A$ to these data yields
$\lambda = 1.02 \pm 0.01$ (estimate $\pm$ 95% confidence), indicating
a small but systematic tendency for descendants to be slightly larger
than their ancestors.

Fig. S2B shows the distribution of within-lineage changes in body size
(equivalent to the vertical residuals to the line $x_D = x_A$ in
Fig. S2A), with increases (615) being only slightly more common than
decreases (488; the remaining 3 cases are instances of no change).
Denoting $\lambda$ as the multiplicative change in body size from
ancestor to descendant, we find that the overall average change is
toward larger sizes, with
$\langle \log \lambda \rangle = 0.047 \pm 0.009$. This estimate
ignores, of course, the possibility that the average change depends on
the ancestor size.

The conventional assumption in simulation studies of body size
evolution is that $F(\lambda)$ follows a log-normal distribution. We
find that the data are consistent with this assumption; however, we
note that the data are also consistent with a log-normal double Pareto
distribution [37], a log-normal distribution with tails that decay as
power laws (or, that decay as exponentials in $\log \lambda$). We test
this hypothesis using standard statistical techniques for power-law
distributions [38], and find that the tails of the distribution can be
assumed to be symmetric [negative tail: $\alpha = 3.4(2)$,
$p = 0.83(3)$; positive tail: $\alpha = 3.3(2)$, $p = 0.79(3)$; both
tails together: $\alpha = 3.3(1)$, $p = 0.96(3)$]. For completeness,
we consider both models of $F(\lambda)$ in our sensitivity analysis,
and find relatively small differences between the results (but see
Appendix D).

2. Our model of changes to body size 

In this section, we describe a model-selection analysis among three
alternative models of within-lineage changes to body size
$F(\lambda)$, all of which are drawn from a log-normal distribution
where the average log-change to size $\mu$ depends on the ancestor's
size $x_A$. In this way, $F(\lambda)$ can model both the effect of
Cope's rule on large-bodied species and the effects of constrained
evolution near the lower limit of body size on real mammalian
evolution (above and beyond the form imposed by respecting the lower
limit in Rule 2). This latter effect we call the small-bodied bias.
For these three models, we ask which has the best empirical support
from the putative ancestor-descendent data.

1. Model one is a piece-wise form in which a bias toward larger sizes
for small-bodied species decreases as a power of body size to a
constant value $\delta$ for large-bodied species (Fig. S2C).

2. Model two is identical to model one but sets the large-body bias
parameter $\delta$ to zero.

3. Model three is a function $\mu$ that follows the best-fit cubic
polynomial (see [14]).

FIG. S7: Sensitivity analysis of the quality of the simulated species
body size to variations in the values of the model parameters
estimated from data (Table S1). Each figure shows the results for the
model $F(\lambda)$ with (red squares) and without (blue circles)
power-law tails; for clarity, we also plot a smoothed trend
(exponential kernel) over the results. Each point denotes the
$\langle \mathrm{wKS} \rangle$ statistic, averaged over the last 15 My
of the simulation and over 100 independent trials. Further, because
$\rho$ is a free parameter of the model, for each point, we
re-estimated $\rho$ as the value that gave the minimum
$\langle \mathrm{wKS} \rangle$ (over 100 independent trials), given the
choice of the parameter in question, with all other parameters being
held fixed.

All models have the form
$\log \lambda(x_A) \sim N[\mu(x_A), \sigma^2]$, that is,
$\log \lambda$ is normally distributed with constant variance
$\sigma^2$ and a mean $\mu$ that varies as a function of body size
$x_A$, where the particular functional form of $\mu(x_A)$ varies from
model to model. In the first two cases, we use a simple piece-wise
linear function:

$$\mu(x_A) = \begin{cases} -(c_1/c_2)\,\log x_A + c_1 + \delta & \text{if } \log x_A < c_2, \\ \delta & \text{otherwise,} \end{cases} \qquad \text{(B1)}$$

where $c_1$ is the y-intercept and $c_2$ is the x-intercept of the
size-dependent bias for small bodies, and $\delta$ is the magnitude of
the systematic positive bias for larger species. Thus, $c_1$ controls
the strength of the small-body bias and $c_2$ controls the range over
which this bias decays; their ratio $-c_1/c_2$ gives the power by
which the bias toward larger descendants decreases with increasing
ancestor size. When $\delta = 0$ (model two), there is no systematic
bias toward larger bodied species; a bias toward larger descendants
(Cope's rule) is modeled by $\delta > 0$ (model one). In the third
case, we let $\mu(x_A)$ be the best-fit third-order polynomial to the
ancestor-descendent data [13]; this function crosses the x-axis in
three places, implying the existence of two "optimal" body sizes, one
for small-bodied and one for large-bodied species. In all cases, we
estimate the free parameters of these models from the data using
maximum likelihood.
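The piece-wise mean in Eq. (B1) is small enough to write out directly. The sketch below is in Python rather than the Matlab of Appendix E, and the function name is hypothetical. It uses the parameter values quoted in Appendix E ($c_1 = 0.33$, $c_2 = 1.30$, $\delta = 0.04$) and assumes base-10 logarithms of mass in grams, as in that listing.

```python
from math import log10

# Values quoted in Appendix E; treated here as illustrative defaults.
C1, C2, DELTA = 0.33, 1.30, 0.04


def mean_log_change(ancestor_mass_g, c1=C1, c2=C2, delta=DELTA):
    """Mean log-change mu(x_A) for models one and two (Eq. B1)."""
    log_mass = log10(ancestor_mass_g)
    if log_mass < c2:
        # Small-bodied bias: declines linearly in log-size with slope -c1/c2.
        return -(c1 / c2) * log_mass + c1 + delta
    # Large-bodied species: constant bias (Cope's rule); delta = 0 is model two.
    return delta


for mass in (2.0, 10.0, 100.0, 1e4):
    print(f"{mass:8.0f} g  mu = {mean_log_change(mass):.3f}")
```

With these values the slope is $-c_1/c_2 \approx -0.25$, and the small-bodied term vanishes at $\log x_A = c_2$, so the two branches join continuously at $\mu = \delta$.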

Although each of the three models fits the data reasonably well
($p > 0.1$ under a standard parametric bootstrap test [39]), the data
is closest to model one [likelihood ratio test (LRT) [13],
$|\log(\mathcal{L}_1/\mathcal{L}_2)| = 7.226$,
$p = 1.44 \times 10^{-4}$ and
$|\log(\mathcal{L}_1/\mathcal{L}_3)| = 1.001$, $p = 0.84$, with
similar results for a Bayesian Information Criterion (BIC)
comparison].
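The first of these p-values can be recovered under one reading of the test, offered here as an assumption: that models one and two are compared with a classical nested likelihood-ratio test, where model two fixes $\delta = 0$ and so has one fewer free parameter, and twice the log-likelihood ratio is referred to a chi-square distribution with one degree of freedom. The Python sketch below (function name hypothetical) uses only the standard library. It does not apply to the comparison with model three, which is not nested in this way.

```python
from math import erfc, sqrt


def nested_lrt_pvalue(abs_log_likelihood_ratio):
    """p-value of 2*|log(L1/L2)| under a chi-square law with 1 d.o.f."""
    statistic = 2.0 * abs_log_likelihood_ratio
    # Survival function of chi-square(1): P(X > s) = erfc(sqrt(s / 2)).
    return erfc(sqrt(statistic / 2.0))


p_value = nested_lrt_pvalue(7.226)
print(f"p = {p_value:.2e}")
```

The result agrees with the quoted $p = 1.44 \times 10^{-4}$ to the precision given, which is consistent with this reading but does not confirm it as the procedure actually used.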



Fig. S2C shows the fitted form of the best model, where the strength
of Cope's rule is $\delta = 0.04 \pm 0.01$ for $x_A > 32$ g (or, an
average growth of $4.1 \pm 1.0$% per speciation event), along with the
raw ancestor-descendent data. This model is visually very similar to a
smoothed version of the data [4l[, shown in Fig. S3. Although these
results suggest that our estimated model is a good summary of the
data, the data themselves could be biased in several ways. A more
robust analysis would combine the likelihood ratio test approach
employed here with an appropriate model of the errors and bias, were
such an error-model known for this kind of data.



Finally, we note that the fitted power-law model of the bias toward
larger descendants for small-bodied ancestors has an exponent
$\gamma \approx -1/4$, which may or may not be related to the
prevalent quarter-power scaling in ecology [Hj].

Model description                           $\langle\mathrm{wKS}\rangle$  $x_{\min}$  $\delta$  $c_1$  $c_2$  $\rho$   Fig.
Full model                                  0.181(1)                      1.8         0.04      0.33   1.30   0.025    S9A
Full model with no small-size bias          0.244(1)                      1.8         0.04      0      0.25   0.023    S9B
Unbiased diffusion with lower bound         2.97(3)                       1.8         0         0      0.25   0        S9C
Cope's rule with size-dependent extinction  10.60(7)                      $10^{-8}$   0.04      0      -8     -0.002   S10D
Cope's rule alone                           11.72(9)                      $10^{-8}$   0.04      0      -8     0        S10E
Size-dependent extinction alone             10.37(6)                      $10^{-8}$   0         0      -8     -0.005   S10F

TABLE S2: A comparison of the full model described in Appendix A 1
with five simpler models. Each of these alternatives is a special case
of the full model and many have been discussed in the literature (see
@,0|) as methods for generating right-skewed size distributions. Each
model was run 1000 times, from which we computed the central tendency
of the simulated distribution (shown in Figs. S9 and S10) and the
average statistical distance $\langle \mathrm{wKS} \rangle$ from the
empirical distribution. Results reported here are for $F(\lambda)$
with power-law tails and the power-law model of extinction risk
(similar results for log-normal tails or logarithmic extinction risk);
the standard error in the last digit is quoted parenthetically. For
models with $\rho \neq 0$, $\rho$ was estimated by minimizing
$\langle \mathrm{wKS} \rangle$.



APPENDIX C: ADDITIONAL MODEL RESULTS 
AND ANALYSIS 

1. Simulation results 

To convey some notion of how the simulation develops the species body
size distribution over time, Fig. S5 shows snapshots of simulated
data, along with the empirical data, taken from a single run of the
simulation. Initially, the simulated distribution is concentrated
around the size of the founder species $x_0$, but, over time, the
distribution's right tail lengthens considerably until the simulated
distribution is very close to the empirical one, for all body sizes.
After approximately 30 000 total simulated species (Fig. S5C), the
agreement between the simulation and data is already relatively good
(wKS = 0.37), with the main disagreement being for the largest-bodied
species. By this point, the disagreement for small- and
intermediate-bodied species is very small. Fig. S6 shows the
corresponding time series of the wKS statistic over simulated time,
for the same simulation.



2. Sensitivity analysis 

We tested the dependence of our results on the particular estimated
parameter values by conducting a thorough sensitivity analysis that
varied each parameter independently over a wide range of values. For
each of these alternative parameterizations, we re-estimated the value
of the free parameter $\rho$ (by repeating the calculation shown in
Fig. S4A). Fig. S7 shows the results of these tests, for two models of
$F(\lambda)$, one with log-normal and one with power-law tails.
Results from the logarithmic model of extinction risk are omitted as
they are virtually indistinguishable from the results from the
power-law model.



Typically, the precision of the simulated distribution is highly
robust to variations in the estimated values of most model parameters,
with wKS < 0.3 and deviations appearing only in the extreme tails or
in the 1 kg or 300 kg ranges. In particular, the precision is highly
insensitive to the size of the founder species $x_0$ or the length of
the simulation (parameterized by the average lifetime of a species
$\nu$), and only mildly sensitive to the variance in the diffusion
process $\sigma$. Somewhat greater sensitivity is seen for the
strength of Cope's rule $\delta$, although both positive and negative
values produce good fits to the data. The most sensitive parameter is
the value of the lower limit $x_{\min}$, with good fits only being
produced when $x_{\min} \approx 2$ g.

We performed a second sensitivity test to probe the connection between
the strength of Cope's rule $\delta$ and the rate of increasing risk
from extinction for larger bodied species $\rho$. By systematically
varying these two parameters, we find that the particular shape of the
right-tail of the empirical distribution can only be produced when
these two parameters co-vary in a very regular fashion. Fig. S8 shows
the results of this experiment, where we choose two different values
of $\nu$ (average species lifetime) and two different forms for
$F(\lambda)$ (as before, log-normal or power-law tails).

We interpret these results in the following way. The 
greater the short-term selective benefits derived from in- 
creased species body size, the more species tend to have 
larger body size, at the expense of smaller body size. If 
the increased risk of extinction from increased body size 
does not increase in a related way, then the distribution of 
species body sizes becomes more heavily weighted toward 
large-bodied species. If, however, the risk of extinction 
increases proportionally to the increased benefits of body 
size, then the size distribution's steady-state remains unchanged.



APPENDIX D: COMPARISON WITH SIMPLE
DIFFUSION MODELS

Less complex diffusion models have also been suggested as possible
explanations of right-skewed (on a log-scale) species body size
distributions (see @, H[). The model described in Appendix A 1
naturally generalizes many of these models, and thus allows us to
easily ask whether any of these simpler models are also adequate
explanations of the empirical distribution.

In particular, we consider (1) unbiased diffusion with a lower
boundary, (2) Cope's rule with size-dependent extinction, (3) Cope's
rule alone, and (4) size-dependent extinction alone. Additionally, we
consider (5) a simplified version of the full model that omits the
increased bias toward larger descendants for small-bodied species near
$x_{\min}$, i.e., a model in which $\mu(x_A) = \delta$ rather than the
more complex form given in Eq. (B1) (see Appendix B 2). Simulation
results for the full model and each of these five models are shown in
Figs. S9 and S10. The results of the experiments are summarized in
Table S2.

For each model, we repeated the simulation 1000 times to compute the
simulated distributions' central tendencies (as in Fig. S1D). We also
calculated the average distributional distance
$\langle \mathrm{wKS} \rangle$ from the empirical distribution, which
we used to rank-order the models in terms of their accuracy. For
models that included the mechanism for size-dependent extinction
(i.e., when $\rho \neq 0$), we re-estimated $\rho$ by minimizing
$\langle \mathrm{wKS} \rangle$ relative to the empirical distribution.
In general, except as specified in Table S2, the parameters of the
included mechanisms were set according to our estimates from fossil
data (Table S1).

The results of this exercise indicate that the full model is the best
explanation of the empirical distribution for terrestrial mammals
($\langle \mathrm{wKS} \rangle = 0.181$, Fig. S9A), reproducing the
entire distribution quite accurately, with the exception of
significant deviations near 1 kg and 300 kg. We note that none of the
alternative models could reproduce these deviations. Further, only the
full model, which includes the increased bias for small-bodied
species, accurately reproduces the left-tail of the empirical
distribution. All other models, including the model that omits only
this behavior but is otherwise identical to the full model
(Fig. S9B), overestimate the number of species with size $x < 40$ g.
To be clear, the lower limit on body size itself causes the left-tail
of the simulated distribution to decay somewhat like that of the
empirical distribution, but only by including the increased bias for
small-bodied species, inferred from fossil data (Appendix B 2), do the
tails coincide.

The second best model is the one that omits the small-size bias
($\langle \mathrm{wKS} \rangle = 0.244$, Fig. S9B). This model,
however, fails to accurately reproduce the left-tail of the empirical
distribution; the fit to the right-tail is largely unaffected. The
third-best model is unbiased diffusion in the presence of a lower
boundary but without a size-dependent extinction risk
($\langle \mathrm{wKS} \rangle = 2.97$, Fig. S9C). This model produces
distributions with a heavy right-tail and a steep decline in density
near $x_{\min}$, but dramatically misestimates the number of
large-bodied species (too many for $F(\lambda)$ with power-law tails,
and too few for $F(\lambda)$ with log-normal tails), and the number of
species near the modal size $x \approx 40$ g. This model also has the
possibly undesirable feature of no steady-state. That is, the more
time has passed, the heavier the distribution's right-tail, and the
larger the largest extant mammal, becomes. This implies that the
similarity of the simulated and empirical distributions, in this case,
depends strongly on the mean species lifetime $\nu$ and the length of
the simulation $\tau$.

The three models with no lower limit $x_{\min}$ failed to produce
distributions remotely close to the empirical one, with
$\langle \mathrm{wKS} \rangle > 10.6$ in all cases, and typically
produced an over-abundance of extremely small species (e.g.,
$x < 0.01$ g). It may be possible to improve these results by altering
some model parameters far beyond the values estimated from fossil
data, e.g., significantly increasing the strength of Cope's rule
$\delta$ and the extinction risk at larger sizes $\rho$ to drive
small-bodied species toward larger sizes. Alternatively, more complex
mechanisms may also improve the results of these simple models, e.g.,
an extinction-risk curve that increases weakly above, and strongly
below, $x \approx 40$ g would partly mimic the effect of a hard lower
limit; using a more complex $F(\lambda)$ can certainly produce
apparently complicated distributions (e.g., [H); etc.

Thus, all three processes (a fundamental lower limit, the diffusion of
species size, and an increasing risk of extinction with size) are
necessary to reproduce the empirical distribution of Recent
terrestrial mammals, and models that omit either the lower limit
$x_{\min}$ or extinction risks that increase with body size never
produce realistic distributions, when using parameter estimates drawn
from fossil data. Further, we found that an increased bias toward
larger sizes for small-bodied species ($x < 32$ g) is necessary to
reproduce the particular shape of the empirical distribution's
left-tail (small-bodied species); without this increased bias, the
model consistently overestimates the number of species near the lower
limit. Finally, we found that a systematic relationship between the
strength of Cope's rule $\delta$ and the rate at which extinction risk
increases $\rho$ is necessary to produce realistic body size
distributions, such that an increase in the short-term benefit of
increased size can be balanced by a comparable increase in the
long-term risk of extinction from increased size.



APPENDIX E: SIMULATION CODE 

This simulation code is written in the Matlab program- 
ming language. It requires no additional toolboxes to 
run, and should be compatible with all recent versions of 
the software. 

% simulation parameters
xmin  = 1.8;     % lower bound
x0    = 40;      % founder body size
n     = 5000;    % num. species at equilibrium
beta  = 1/n;     % baseline extinction rate
rho   = 0.025;   % rate of extinction increase
nu    = 1.6;     % mean species lifetime (My)
tau   = 60;      % total simulation time (My)
c(1)  = 0.33;    % log-lambda intercept
c(2)  = 1.30;    % log-size intercept
delta = 0.04;    % systematic bias (Cope's rule)
sigma = 0.63;    % variance (applied as the std. dev. of log-lambda)
alpha = 0.30;    % power-law tail

% data structure set up
tmax   = round((tau/nu)*n);
xmax   = 10^15;
x      = -Inf*ones(ceil(1.5*n),1);
x(1)   = x0;
kdt    = 5000;
ns     = 1;
nk     = 0;
kd     = 1;
f_stop = 0;

% begin main loop
while ~f_stop

    % begin cladogenesis step
    pair = [ceil(ns*rand(1)) ns+1];
    mass = x(pair(1),1);
    L1 = mass/xmin;    % lower bound
    L2 = xmax/mass;    % upper bound
    % model of Cope's rule
    if log10(mass)<c(2)
        % increased bias for small sizes
        mu = (-c(1)/c(2))* ...
            log10(mass)+c(1)+delta;
    else
        % uniform bias for large sizes
        mu = delta;
    end;
    % Monte Carlo draw of growth factors
    tt = [0 0];
    while any(tt<1/L1 | tt>L2)
        % F(lambda) with power-law tails
        tt = exp(randn(2,1)*sigma+mu) .* ...
            ((rand(2,1) .* ...
            (1-1./L1)+1./L1).^alpha) ./ ...
            ((rand(2,1) .* ...
            (1-1./L2)+1./L2).^alpha);
    end;
    x(pair) = mass.*tt;
    kd = kd+2;
    ns = ns+1;
    % end cladogenesis step

    % begin extinction step
    % power-law model of extinction risk
    kl = rand(ns,1) < ...
        10.^(rho*log10(x(1:ns))+log10(beta));
    if sum(kl)>0
        x(1:sum(~kl)) = x(~kl);
        x(sum(~kl)+1:ns) = ...
            repmat([-Inf],sum(kl),1);
        ns = sum(~kl);
        nk = nk+sum(kl);
    end;
    % end extinction step

    % begin check stop-criteria
    if kd>=tmax, f_stop = 1; end;
    % end check stop-criteria

end;
% end main loop
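
For readers without Matlab, the same loop can be sketched in Python. This is an illustrative port, not the authors' code, and the function name and the `seed` argument are additions. It also differs from the listing above in three ways: the default equilibrium size `n` is reduced to 200 so that a run finishes quickly, the loop stops if every species goes extinct, and candidate descendants are accepted by checking their sizes against the bounds directly rather than checking the growth factors, which is equivalent apart from rounding.

```python
import math
import random


def simulate(n=200, xmin=1.8, x0=40.0, rho=0.025, nu=1.6, tau=60.0,
             c1=0.33, c2=1.30, delta=0.04, sigma=0.63, alpha=0.30,
             xmax=1e15, seed=0):
    """Return the body sizes (g) of species alive at the end of one run."""
    rng = random.Random(seed)
    beta = 1.0 / n                    # baseline extinction rate
    tmax = round((tau / nu) * n)      # total number of species to create
    sizes = [x0]
    created = 1                       # plays the role of kd in the listing
    while created < tmax and sizes:
        # Cladogenesis: a random ancestor is replaced by two descendants.
        i = rng.randrange(len(sizes))
        mass = sizes[i]
        lo, inv_hi = xmin / mass, mass / xmax
        if math.log10(mass) < c2:
            mu = (-c1 / c2) * math.log10(mass) + c1 + delta
        else:
            mu = delta
        while True:
            children = []
            for _ in range(2):
                core = math.exp(rng.gauss(0.0, 1.0) * sigma + mu)
                shrink = (rng.random() * (1.0 - lo) + lo) ** alpha
                grow = (rng.random() * (1.0 - inv_hi) + inv_hi) ** alpha
                children.append(mass * core * shrink / grow)
            # Rejection step: redraw both factors unless both sizes are valid.
            if all(xmin <= child <= xmax for child in children):
                break
        sizes[i] = children[0]
        sizes.append(children[1])
        created += 2
        # Extinction: each species dies with probability beta * size**rho.
        sizes = [s for s in sizes if rng.random() >= beta * s ** rho]
    return sizes


final_sizes = simulate(seed=1)
print(len(final_sizes), "species alive at the end of the run")
```

Fixing the seed makes a run repeatable, which helps when comparing parameter settings. The quality scores reported in the text are averages over many independent trials, so a single run of this sketch should not be read as a result.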

FIG. S8: Sensitivity analysis for power-law tails in the distribution
of body-size changes and the average lifetime of species $\nu$. In
each case, we systematically varied both the strength of Cope's rule
$\delta$ and the strength of extinction for larger body sizes $\rho$,
and computed the average goodness-of-fit to the empirical distribution
function, for the last 15 My of the simulation, over 100 independent
trials. In each figure, we circle the region of parameter space that
provides the best fit to the data,
$\langle \mathrm{wKS} \rangle < 0.25$. (A, C) show results for using a
log-normal distribution with power-law tails (also known as a
log-normal double Pareto); (B, D) show results for the same log-normal
distribution but without power-law tails. (A, B) show results for
$\nu = 2.3$ My, while (C, D) show results for $\nu = 1.0$ My. Notably,
the model with log-normal tails has a much narrower range of parameter
values that provide good fits to the data. For the models with
power-law tails, an extinction parameter of $0.02 < \rho < 0.03$
provides the best fit to the data for the particular strength of
Cope's rule we estimated from fossil data, $\delta = 0.04$, regardless
of how long the simulation is run.

FIG. S9: A comparison of the full model described in Appendix A 1 with
several simpler models, all of which have a lower boundary
$x_{\min}$. Results are presented in pairs, showing results for
$F(\lambda)$ with power-law tails (left) and without (right). In all
cases, model results show the central tendency of the model (over 1000
repetitions) with 95% confidence intervals. (A) The full model as
described in the text, with all parameters as set in Table S1. (B) The
same model as in A, but with no increase in
$\langle \log \lambda \rangle$ for small-bodied species. (C) The same
model as in B, but also with no size-dependent extinction risk and
without Cope's rule for large-bodied species
($\langle \log \lambda \rangle = 0$), i.e., a model of unbiased
diffusion with a lower bound. Table S2 summarizes these results and
gives the specific parameter settings used.

FIG. S10: As in Fig. S9, results for simpler models are presented in
pairs, showing $F(\lambda)$ with power-law tails (left) and without
(right). All models shown here effectively have no lower boundary on
size, i.e., we set $x_{\min} = 10^{-8}$ g. (D) The model described in
the text, but with no increase in $\langle \log \lambda \rangle$ for
small-bodied species, i.e., a model with Cope's rule for all species
and with size-dependent extinction risk. (E) The same model as in D,
but with no size-dependent extinction risk, i.e., a model of Cope's
rule alone. (F) The same model as in D, but without Cope's rule, i.e.,
a model with size-dependent extinction risk alone. Table S2 summarizes
these results and gives the specific parameter settings used.



